253 research outputs found

Levenshtein distances fail to identify language relationships accurately

Author: Greenhill Simon
Publication venue: 'MIT Press - Journals'
Publication date: 11/12/2015

The Levenshtein distance is a simple distance metric derived from the number of edit operations needed to transform one string into another. This metric has received recent attention as a means of automatically classifying languages into genealogical subgroups. In this article I test the performance of the Levenshtein distance for classifying languages by subsampling three language subsets from a large database of Austronesian languages. Comparing the classification proposed by the Levenshtein distance to that of the comparative method shows that the Levenshtein classification is correct only 40% of the time. Standardizing the orthography increases the performance, but only to a maximum of 65% accuracy within language subgroups. The accuracy of the Levenshtein classification decreases rapidly with phylogenetic distance, failing to discriminate homology and chance similarity across distantly related languages. This poor performance suggests the need for more linguistically nuanced methods for automated language classification tasks.
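
As a minimal sketch of the metric described in this abstract, the following Python function counts the single-character insertions, deletions, and substitutions needed to turn one string into another, using the standard dynamic-programming formulation. The function name `levenshtein` and the example words are illustrative assumptions; the abstract does not say which implementation, normalization, or word lists the article used.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b.

    Illustrative sketch only: unit cost for insertion, deletion, and
    substitution is an assumption, not a detail taken from the article.
    """
    # Keep the shorter string in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete char_a
                    current[j - 1] + 1,                   # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # → 3
```

Because the count depends on the raw characters, two spellings of the same sound register as an edit, which is consistent with the abstract's observation that standardizing the orthography improves the results.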

The Australian National University

POLLEX-Online: The Polynesian Lexicon Project Online

Author: Ross Clark
Simon J. Greenhill
Publication venue: 'Project Muse'
Publication date: 01/01/2011

Crossref

MPG.PuRe

Basic vocabulary and Bayesian phylolinguistics

Author: Gray Russell D
Greenhill Simon J
Publication venue: 'John Benjamins Publishing Company'
Publication date: 11/12/2015

Donohue et al.’s critique of our work on the origins and spread of the Austronesian language family is marred by misunderstandings of our approach. We respond to these by noting that our Bayesian phylogenetic approach: (1) distinguishes between retentions and innovations probabilistically, (2) focuses on basic vocabulary not ‘the lexicon’, (3) eliminates known loanwords, (4) produces results that are congruent with the results of the comparative method and conflict with the scenarios requiring unprecedented amounts of language shift postulated by Donohue et al.

The Australian National University

A lexicostatistical study of the Khasian languages: Khasi, Pnar, Lyngngam, and War

Author: Greenhill Simon
Nagaraja K S
Sidwell Paul
Publication venue: Mahidol University & SIL International
Publication date: 15/11/2020

This paper presents the results of lexicostatistical, glottochronological, and Bayesian phylogenetic analyses of a 200-word data set for Standard Khasi, Lyngngam, Pnar and War. Very few works have appeared on the subject of the internal classification of the Khasian branch of Austroasiatic, leaving the existing reference literature disappointingly incomplete. The present analysis supports both the strong identity of Khasian as a unitary branch and an internally nested branching structure that fits neatly with known historical, geographical and linguistic facts. Additionally, lexically based dating methods suggest that the internal diversification of Khasian began roughly between 1500 and 2000 years ago. Copyright Information: Copyright for this paper vested in the authors. Released under Creative Commons Attribution License.

The Australian National University

Incorporating contextual audio for an actively anxious smart home

Author: Greenhill Stewart
Moncrieff Simon
Venkatesh Svetha
West Geoff
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005

Deakin Research Online

Why do religious cultures evolve slowly? The cultural evolution of cooperative calling and the historical study of religions

Author: Atkinson Quentin
Bulbulia Joseph
Gray Russell
Greenhill Simon
Publication venue: 'Informa UK Limited'
Publication date: 27/12/2020

Collective representations are the result of an immense cooperation, which stretches out not only into space but into time as well; to make them, a multitude of minds have associated, united and combined their ideas and sentiments: for them, long generations have accumulated their experience and their knowledge. A special intellectual activity is therefore concentrated in them, which is infinitely richer and complexer than that of the individual. (Émile Durkheim, Elementary Forms of the Religious Life, [1912] 1965: 29) The languages and folkways of ancient peoples hold little relevance for us, except in one respect: the religions of the ancient world remain our religions. Though religions change, core features of the scriptures and rituals of the world's most popular religious traditions appear to have been conserved with remarkably high fidelity. We explain slow religious change from how religion facilitates cooperation at large social scales. At the end, we clarify how historians of religion, in collaboration with psychologists and computational biologists, might test and improve explanations such as ours. This research was supported by the John F. Templeton Foundation (Testing the Functional Roles of Religion in Human Society, no. 28745), the Royal Society of New Zealand (The Cultural Evolution of Religion, no. 11-UOA-23

The Australian National University

Population structure and cultural geography of a folktale in Europe.

Author: Atkinson Quentin D
Greenhill Simon J
Ross Robert M
Publication venue: 'The Royal Society'
Publication date: 11/12/2015

Despite a burgeoning science of cultural evolution, relatively little work has focused on the population structure of human cultural variation. By contrast, studies in human population genetics use a suite of tools to quantify and analyse spatial and temporal patterns of genetic variation within and between populations. Human genetic diversity can be explained largely as a result of migration and drift giving rise to gradual genetic clines, together with some discontinuities arising from geographical and cultural barriers to gene flow. Here, we adapt theory and methods from population genetics to quantify the influence of geography and ethnolinguistic boundaries on the distribution of 700 variants of a folktale in 31 European ethnolinguistic populations. We find that geographical distance and ethnolinguistic affiliation exert significant independent effects on folktale diversity and that variation between populations supports a clustering concordant with European geography. This pattern of geographical clines and clusters parallels the pattern of human genetic diversity in Europe, although the effects of geographical distance and ethnolinguistic boundaries are stronger for folktales than genes. Our findings highlight the importance of geography and population boundaries in models of human cultural variation and point to key similarities and differences between evolutionary processes operating on human genes and culture.

The Australian National University

Scintillation in the Circinus Galaxy water megamasers

Author: David L. Jauncey
Gardner F. F.
James E. J. Lovell
Jamie N. McCallum
Lincoln J. Greenhill
Rickett B. J.
Simon P. Ellingsen
Publication venue: 'University of Chicago Press'
Publication date: 11/11/2004

We present observations of the 22 GHz water vapor megamasers in the Circinus galaxy made with the Tidbinbilla 70 m telescope. These observations confirm the rapid variability seen earlier by Greenhill et al. (1997). We show that this rapid variability can be explained by interstellar scintillation, based on what is now known of the interstellar scintillation seen in a significant number of flat spectrum AGN. The observed variability cannot be fully described by a simple model of either weak or diffractive scintillation.

arXiv.org e-Print Archive

Crossref

How Accurate and Robust Are the Phylogenetic Estimates of Austronesian Language Relationships?

Author: Alexei J. Drummond
Dale J. Hedges
Russell D. Gray
Simon J. Greenhill
Publication venue: Public Library of Science
Publication date: 01/01/2009

We recently used computational phylogenetic methods on lexical data to test between two scenarios for the peopling of the Pacific. Our analyses of lexical data supported a pulse-pause scenario of Pacific settlement in which the Austronesian speakers originated in Taiwan around 5,200 years ago and rapidly spread through the Pacific in a series of expansion pulses and settlement pauses. We claimed that there was high congruence between traditional language subgroups and those observed in the language phylogenies, and that the estimated age of the Austronesian expansion at 5,200 years ago was consistent with the archaeological evidence. However, the congruence between the language phylogenies and the evidence from historical linguistics was not quantitatively assessed using tree comparison metrics. The robustness of the divergence time estimates to different calibration points was also not investigated exhaustively. Here we address these limitations by using a systematic tree comparison metric to calculate the similarity between the Bayesian phylogenetic trees and the subgroups proposed by historical linguistics, and by re-estimating the age of the Austronesian expansion using only the most robust calibrations. The results show that the Austronesian language phylogenies are highly congruent with the traditional subgroupings, and the date estimates are robust even when calculated using a restricted set of historical calibrations.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The Australian National University

MPG.PuRe

CLICS² An Improved Database of Cross-Linguistic Colexifications: Assembling Lexical Data with the Help of Cross-Linguistic Data Formats

Author: Anderson Cormac
Forkel Robert
Greenhill Simon
List Johann-Mattis
Mayer Thomas
Tresoldi Tiago
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2018

The Database of Cross-Linguistic Colexifications (CLICS) has established a computer-assisted framework for the interactive representation of cross-linguistic colexification patterns. In its current form, it has proven to be a useful tool for various kinds of investigation into cross-linguistic semantic associations, ranging from studies on semantic change and patterns of conceptualization to linguistic paleontology. But CLICS has also been criticized for obvious shortcomings, ranging from the underlying dataset, which still contains many errors, up to the limits of cross-linguistic colexification studies in general. Building on recent standardization efforts reflected in the Cross-Linguistic Data Formats initiative (CLDF) and novel approaches for fast, efficient, and reliable data aggregation, we have created a new database for cross-linguistic colexifications, which not only supersedes the original CLICS database in terms of coverage but also offers a much more principled procedure for the creation, curation and aggregation of datasets. The paper presents the new database and discusses its major features.

The Australian National University

MPG.PuRe